iResearch Foreign Language Academic Research Network

Brown, James Dean. (2014). The Future of World Englishes in Language Testing. Language Assessment Quarterly, 11, 5-26.
Abstract: This article begins by defining World Englishes (WEs) and the related paradigm of inner-, outer-, and expanding-circle English(es). The discussion then turns to the central concerns of the WEs and language testing (LT) communities with regard to how English tests can best be constructed to include various WEs by discussing (a) what language testers need to understand about WEs (i.e., that the English native speaker norm is no longer sacred and that three different perspectives on English diversity may prove useful in LT) and (b) what language testers need to convey to WEs advocates (i.e., that LT is already contributing to the understanding of linguistic variation, that LT is not ignoring WEs issues, and that LT is much more than the standardized international tests). The article ends with seven recommendations that should make the intersection of WEs and LT more productive. Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, Language Tests, Language Variation, Language Varieties, New Englishes, English, English as an International Language, Language Diversity, English as a Second Language Tests
Zhang, Limei, Goh, Christine C M, Kunnan, Antony John. (2014). Analysis of Test Takers' Metacognitive and Cognitive Strategy Use and EFL Reading Test Performance: A Multi-Sample SEM Approach. Language Assessment Quarterly, 11, 76-102.
Abstract: This study investigates the relationships between test takers' metacognitive and cognitive strategy use through a questionnaire and their test performance on an English as a Foreign Language reading test. A total of 593 Chinese college test takers responded to a 38-item metacognitive and cognitive strategy questionnaire and a 50-item reading test. The data were randomly split into two samples (N = 296 and N = 297). Based on relevant literature, three models (i.e., unitary, higher order, and correlated) of strategy use and test performance were hypothesized and tested to identify the baseline model. Further, cross-validation analyses were conducted. The results supported the invariance of factor loadings, measurement error variances, structural regression coefficients, and factor variances for the unitary model. It was found that college test takers' strategy use affected their lexico-grammatical reading ability significantly. Findings from this study provide empirical and validating evidence for Bachman and Palmer's (2010) model of strategic competence. Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, Metacognition, Language Tests, College Students, Cognitive Processes, English as a Second Language Tests
Knoch, Ute. (2014). Using subject specialists to validate an ESP rating scale: The case of the International Civil Aviation Organization (ICAO) rating scale. English for Specific Purposes, 33(Jan), 77-86.
Abstract: As part of the English-language proficiency requirements for pilots and air traffic controllers, the International Civil Aviation Organization (ICAO) published a rating scale designed to assess pilots' and air traffic controllers' aviation English proficiency. However, it is not clear how this scale was developed. As part of an attempt to address the need for validation, this paper presents a study involving focus group interviews with pilots. Ten pilots listened to performances of test takers taking a variety of aviation English tests. The pilots were asked to rate the acceptability of the pilot's language for (a) communicating with other pilots and (b) radiotelephony communications with air traffic control. The focus groups had two aims: (1) to establish the 'indigenous' assessment criteria pilots use when assessing the language ability of peers and (2) to establish what level is sufficient as the operational level. The results showed that the pilots focused on some but not all of the criteria on the ICAO scale. Whilst listening to the performances, they also often focused on the speakers' technical knowledge. The paper proposes a model of how industry professionals can be involved in the validation of an LSP rating scale. [Copyright The American University; published by Elsevier Ltd.]
Keywords: applied linguistics, language testing and assessment, Rating Scales, English for Special Purposes, English as a Second Language Tests, English Proficiency
Stoynoff, Stephen. (2012). Research agenda: Priorities for future research in second language assessment. Language Teaching, 45(2), 234-249.
Abstract: In a recent state-of-the-art (SoA) article (Stoynoff 2009), I reviewed some of the trends in language assessment research and considered them in light of validation activities associated with four widely used international measures of L2 English ability. This Thinking Allowed article presents an opportunity to revisit the four broad areas of L2 assessment research (conceptualizations of the L2 construct, validation theory and practice, the application of technology to language assessment, and the consequences of assessment) discussed in the previous SoA and to propose tasks I believe will promote further advances in L2 assessment. Of course, the research tasks I suggest represent a personal stance and readers are encouraged to consider additional perspectives, including those expressed by Bachman (2000), Chalhoub-Deville & Deville (2005), McNamara & Roever (2006), Shaw & Weir (2007), and Stansfield (2008). Moreover, readers will find useful descriptions of current research approaches to investigating L2 assessments in Lumley & Brown (2005), Weir (2005a), Chapelle, Enright & Jamieson (2008), Lazaraton (2008), and Xi (2008). Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, English as a Second Language Tests, Second Language Tests, Second Language Instruction, English as a Second Language Instruction, Research Design
Gan, Zhengdong. (2012). Complexity measures, task type, and analytic evaluations of speaking proficiency in a school-based assessment context. Language Assessment Quarterly, 9, 133-151.
Abstract: This study, which is part of a large-scale study of using objective measures to validate assessment rating scales and assessment tasks in a high-profile school-based assessment initiative in Hong Kong, examined how grammatical complexity measures relate to task type and analytic evaluations of students' speaking proficiency in a classroom-based assessment context. An in-depth analysis of oral performance on two different assessment tasks (i.e., monologic vs. interactive) from 30 English as a Second Language, Cantonese-mother-tongue, secondary school students was conducted using a range of measures of grammatical complexity derived from the previous second language (L2) speaking and writing studies. Results showed that the individual presentation task tended to promote not only a greater number of T-units, clauses, verb phrases, and words but also longer T-units and utterances, thus probably stretching learners more in terms of complexity of grammatical and lexical processing. Results also showed that complexity measures recommended as among the most useful complexity measures demonstrated no significant correlations with analytic ratings of learner speaking proficiency. These findings were then discussed in light of the complex, dynamic, and developmental nature of grammatical complexity as well as in light of a learner-, task-, and L2 form-sensitive account of L2 oral production. Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, Hong Kong, Cantonese, English as a Second Language Tests, Test Validity and Reliability, Oral Language, Complexity, Secondary School Students
Gui, Min. (2012). Exploring differences between Chinese and American EFL teachers' evaluations of speech performance. Language Assessment Quarterly, 9, 186-203.
Abstract: This study explored whether American and Chinese English as a Foreign Language (EFL) teachers differ in their evaluations of student oral performance by examining the assessments of two groups of raters in an undergraduate speech competition. Each of the 21 contestants presented a 3-min prepared speech on a required topic, responded to a follow-up question, and gave a 1-min impromptu speech on a new topic. Three Chinese and three American EFL teachers rated the speech performances and recorded their comments for the individual contestants as well as for the contestants as a group. Immediately following the competition, the researcher interviewed the raters. The results revealed that American and Chinese EFL raters showed a high degree of agreement on the competition winners and the scores for the contestants. Qualitatively, however, the raters differed in their comments about the students' pronunciation, usage of English expressions, and speech delivery. The Chinese raters unanimously offered positive comments in these three areas, whereas the American raters gave varied and extensive critical comments. These results suggest a need for increased communication between Chinese and American EFL teachers, especially regarding their perceptions of what constitutes good English speech and their pedagogical priorities for oral English instruction. Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, Second Language Teachers, Teacher Attitudes, Speech Tests, English as a Second Language Tests, Test Validity and Reliability
Huang, Shu-Chen. (2012). Pushing learners to work through tests and marks: Motivating or demotivating? A case in a Taiwanese university. Language Assessment Quarterly, 9, 60-77.
Abstract: This study focused on the interface between classroom assessment and learning motivation, in particular, whether and how classroom tests and grades motivated student effort. In a university in Taiwan, six English as a Foreign Language teacher interviews were conducted, and 744 student surveys, accompanied by 289 more detailed written opinions, were gathered and analyzed. It was found that teacher considerations in designing classroom tests and assigning nontest grades were associated with intentions to ensure student efforts. Students were generally alert to grade-related requirements but reacted differently. Many indicated the effectiveness of tests in inducing student effort but felt ambivalent about being pushed to study by tests and grades. Teachers should avoid actually demotivating students when their original aim was to motivate. Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, applied linguistics, English as a second/foreign language instruction, Second Language Teachers, Motivation, English as a Second Language Instruction, College Students, Higher Education, Taiwan, English as a Second Language Tests, Student Attitudes, Teacher Attitudes
Li, Hongli, & Suen, Hoi K. (2012). Are test accommodations for English language learners fair? Language Assessment Quarterly, 9, 293-309.
Abstract: Test accommodations have been proposed to help overcome the unfair challenges faced by English Language Learners (ELLs) due to their relatively low English proficiency. A test accommodation is regarded as effective when it improves the test performance of ELLs. However, this improvement raises the question of whether such accommodations give ELLs an unfair advantage. One criterion used in determining a test accommodation's fairness is that it should only remove the disadvantage that ELLs face in regard to their low language proficiency, without giving ELLs any additional advantages. This criterion is met when the test accommodation does not improve the test performance of the non-ELLs when the same accommodation is applied to them. To determine the fairness and, thus, the validity of test accommodations for ELLs, a meta-analysis using hierarchical linear modeling was conducted to compare the effects of test accommodations on the test performance of ELLs and on that of non-ELLs. The results indicated that test accommodations improved ELLs' test performance by about 0.156 standard deviation units but did not discernibly influence the test performance of non-ELLs. This meta-analysis, therefore, constitutes evidence to support the fairness and validity of providing test accommodations for ELLs. Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, English as a Second Language Tests, Limited English Proficiency, Test Validity and Reliability
Pill, John, & Woodward-Kron, Robyn. (2012). How professionally relevant can language tests be? A response to Wette (2011). Language Assessment Quarterly, 9, 105-108.
Abstract: This is a response to the commentary 'English Proficiency Tests and Communication Skills Training for Overseas-Qualified Health Professionals in Australia and New Zealand' by Rosemary Wette, published in Language Assessment Quarterly, Volume 8, Issue 2, 2011. Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, English Proficiency, Communicative Competence, New Zealand, Australia, Health Care Practitioners, English as a Second Language Tests
Vongpumivitch, Viphavee. (2012). Motivating lifelong learning of English? Test takers' perceptions of the success of the General English Proficiency Test. Language Assessment Quarterly, 9, 26-59.
Abstract: The General English Proficiency Test (GEPT) was developed in accordance with the Taiwanese Ministry of Education's three goals to improve learners' English proficiency, motivate English learning, and promote lifelong learning. This article used questionnaires to investigate the success of the GEPT in meeting these goals. As the GEPT is intended for Taiwanese English as a foreign language learners from all walks of life (Wu, 2012), both student and non-student GEPT test takers were involved in this study (n = 384). Results showed that although most test takers responded that the GEPT was successful in making them feel that their English has improved, only a slight majority responded that the GEPT was successful in motivating them to learn English. Most test takers did not support the idea that the GEPT was successful in promoting lifelong learning. Probit regression was used to examine the relationships between these verdicts and variables such as test takers' background, motivational influences, feeling toward the GEPT, perceptions toward self-assessment, learner autonomy, and capacities for lifelong learning. Based on the findings, the article argues for a unique place of the GEPT in the Taiwanese context and reflects on the use of tests to promote lifelong learning of a foreign language. Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, applied linguistics, English as a second/foreign language learning, English as a Second Language Learning, Taiwan, English as a Second Language Tests, Student Attitudes, Motivation
Wu, Jessica R. W. (2012). GEPT and English language teaching and testing in Taiwan. Language Assessment Quarterly, 9, 11-25.
Abstract: The General English Proficiency Test (GEPT) is a 5-level, criterion-referenced English as a Foreign Language (EFL) testing system implemented in Taiwan to assess the general English proficiency of EFL learners. In 1999, with the aim of encouraging the general study of English and to result in beneficial washback effects on the teaching and learning of English, the Ministry of Education lent its support to the Language Training and Testing Center in the development of the GEPT. Throughout a decade of efforts, the GEPT has won popular recognition in Taiwan. To date, more than 4.3 million Taiwanese have taken the test. This article first documents the evolution of the GEPT from the perspectives of test development and validation. The article then provides an overview of how GEPT scores are used in both educational and professional domains and discusses several key issues and problems that have emerged due to the new context introduced by the GEPT. Finally, the article outlines how the GEPT will address the challenges it faces in the years to come. Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, English as a Second Language Tests, Taiwan, Test Validity and Reliability
Yin, Muchun. (2012). Scratching where they itch: Evaluation of feedback on a diagnostic English grammar test for Taiwanese university students. Language Assessment Quarterly, 9, 78-104.
Abstract: Feedback to the test taker is a defining characteristic of diagnostic language testing (Alderson, 2005). This article reports on a study that investigated how much and in what ways students at a Taiwan university perceived the feedback to be useful on an online multiple-choice diagnostic English grammar test, both in general and by students of higher and lower language proficiency. Stage 1 involved questionnaire data from 68 students who rated each item's feedback according to usefulness, and Stage 2 involved interviews with five students as they read the feedback after taking the test. The data from these two stages showed students' overall positive attitude toward the feedback and students' preferences for particular feedback characteristics. The study also found that although higher proficiency test takers found the feedback to be more useful than lower proficiency test-takers, views about the characteristics of good feedback were similar regardless of level. Recommendations for improving diagnostic language test construction and validation are discussed based upon the findings. Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, Feedback, English as a Second Language Tests, Student Attitudes, Taiwan, College Students, Test Validity and Reliability
Friginal, Eric. (2013). Evaluation of oral performance in outsourced call centres: An exploratory case study. English for Specific Purposes, 32, 25-35.
Abstract: This case study discusses the development and use of an oral performance assessment instrument intended to evaluate Filipino agents' customer service transactions with callers from the United States (US). The design and applications of the instrument were based on a longitudinal, qualitative observation of language training and customer service support practices of Philippine-based agents employed by a US-owned call centre company. Although language training in Philippine call centers continues to improve (Lockwood, 2012), there are still clear limitations to how the oral performance of Filipino agents is evaluated internally by call centre companies. Specialized assessment instruments, following ESP/EOP norms, broadly used by the industry are still relatively untested and many call centers maintain their own metrics that often measure agents' language use and service quality separately (Friginal, 2007, 2009). In this study, the assessment instrument was adapted from the Melbourne Medical Students' Diagnostic Speaking Scale (Grove & Brown, 2001) and further developed to include ESP/EOP approaches in this context of inter-cultural communication. A conveniently sampled set of recorded calls (N = 100) across different task categories (e.g., troubleshooting interactions, product inquiry) was used to test the instrument for initial reliability measures. Results and analysis of the instrument's context suitability and limitations are discussed below. [Copyright The American University; published by Elsevier Ltd.]
Keywords: applied linguistics, language testing and assessment, English for Special Purposes, Business Communication, Native Nonnative Speaker Communication, English as a Second Language Tests, Philippines, Oral Language
Elder, C., Pill, J., Woodward-Kron, R., McNamara, T., Manias, E., Webb, G., & McColl, G. (2012). Health professionals' views of communication: Implications for assessing performance on a health-specific English language test. TESOL Quarterly, 46(2), 409-419.
Abstract: This article highlights a preliminary study that is part of a larger research project relating to a specific-purpose English language test for overseas-trained health professionals: the Occupational English Test (McNamara, 1996). The present study highlights findings from the first phase of the project designed to probe the views of spoken communication by doctors, nurses, and physiotherapists, eliciting these views by analyzing clinical experts' feedback on instances of trainee-patient communication proffered independently of the test setting. This research aims to determine "what criteria underlie the judgements of clinical educators regarding the spoken clinical communication of native- and nonnative-English-speaking health professionals." Adapted from the source document.
Keywords: applied linguistics, language testing and assessment, English as a Second Language Tests, Nonnative Speakers, Native Speakers, Health Care Practitioners, Professional Education, Physicians, Communicative Competence
Murray, J. C., Riazi, A. M., & Cross, J. L. (2012). Test candidates' attitudes and their relationship to demographic and experiential variables: The case of overseas trained teachers in NSW, Australia. Language Testing, 29(4), 577-595.
Abstract: One measure of the impact of a high-stakes test is the attitudes that test takers hold towards it. It has been suggested that positive attitudes produce beneficial effects while real or anticipated negative experiences can result in the development of attitudes that erode confidence and potentially impact negatively on performance. This study investigated test taker attitudes by exploring the opinions, beliefs, and feelings of a group of overseas trained teachers preparing for a professional gate-keeping test, and examining correlations between attitudes and demographic and experiential factors. The participants were 105 candidates who were enrolled in a preparation course for the Professional English Assessment for Teachers. They were asked to complete a written survey questionnaire with three parts: to determine the nature of their attitude towards the test, to explore the relationship of attitudes and demographic data, and to investigate their perceptions of the sources of their attitudes. Results indicated that there was a slight predominance of negative attitudes, particularly among candidates who had unsuccessfully attempted the test. The main reported sources which correlated with a negative attitude were personal experiences and feelings as well as the impact of other people: notably teachers and other candidates. [Reprinted by permission of Sage Publications, Ltd., copyright holder.]
Keywords: applied linguistics, language testing and assessment, Teacher Attitudes, Teacher Education, English as a Second Language Tests, Australia
Pae, T. (2012). Causes of gender DIF on an EFL language test: A multiple-data analysis over nine years. Language Testing, 29(4), 533-554.
Abstract: This study tracked gender differential item functioning (DIF) on the English subtest of the Korean College Scholastic Aptitude Test (KCSAT) over a nine-year period across three data points, using both the Mantel-Haenszel (MH) and item response theory likelihood ratio (IRT-LR) procedures. Further, the study identified two factors (i.e. reading strategy and perceived interest) that explained a portion of the variance in the magnitude of gender DIF via a series of multiple linear regression analyses. The results indicated (1) an interaction between item type and gender DIF and (2) a significant relationship between gender differences in the examinee's perceived interest in test items and the magnitude of gender DIF. The study discusses the results based on previous DIF research and presents pedagogical implications and some avenues for further research. [Reprinted by permission of Sage Publications, Ltd., copyright holder.]
Keywords: applied linguistics, language testing and assessment, Aptitude Tests, Student Attitudes, Higher Education, English as a Second Language Tests, Sex Differences
Stubbe, R. (2012). Do pseudoword false alarm rates and overestimation rates in Yes/No vocabulary tests change with Japanese university students' English ability levels? Language Testing, 29(4), 471-488.
Abstract: Pseudowords, or non-real words, were introduced to the Yes/No (YN) vocabulary test format to provide a means of checking for overestimation of word knowledge by test takers. The purpose of this study is to assess the assumption that more pseudoword checks (false alarms) indicate more instances of overestimation of word knowledge in YN tests. Thirty English classes in five different Japanese universities with TOEIC(TM) scores ranging from 230 to 730 participated (n = 490). YN test results were compared with a multiple-choice test of the same 96 real words to provide a way to check directly for instances of underestimation and overestimation of word knowledge on the YN tests. Results showed that students from the higher proficiency universities had a slightly higher pseudoword false alarm rate than students from the lower ability universities (4.28% and 3.96%, respectively). However, overestimation rates were considerably lower for these same students from the higher proficiency universities (3.24% and 5.67%, respectively). This discrepancy between false alarm rates and overestimation rates questions the value of pseudowords for measuring overestimation in YN vocabulary tests when student ability levels differ significantly. [Reprinted by permission of Sage Publications, Ltd., copyright holder.]
Keywords: applied linguistics, language testing and assessment, Pseudowords, College Students, English Proficiency, Vocabulary, Receptive Language, Higher Education, English as a Second Language Tests
Cho, Y., & Bridgeman, B. (2012). Relationship of TOEFL iBT(TM) scores to academic performance: Some evidence from American universities. Language Testing, 29(3), 421-442.
Abstract: This study examined the relationship between scores on the TOEFL Internet-Based Test (TOEFL iBT(TM)) and academic performance in higher education, defined here in terms of grade point average (GPA). The academic records for 2594 undergraduate and graduate students were collected from 10 universities in the United States. The data consisted of students' GPA, detailed course information, and admissions-related test scores including TOEFL iBT, GRE, GMAT, and SAT scores. Correlation-based analyses were conducted for subgroups by academic status and disciplines. Expectancy graphs were also used to complement the correlation-based analyses by presenting the predictive validity in terms of individuals in one of the TOEFL iBT score subgroups belonging to one of the GPA subgroups. The predictive validity expressed in terms of correlation did not appear to be strong. Nevertheless, the general pattern shown in the expectancy graphs indicated that students with higher TOEFL iBT scores tended to earn higher GPAs and that the TOEFL iBT provided information about the future academic performance of non-native English speaking students beyond that provided by other admissions tests. These observations led us to conclude that even a small correlation might indicate a meaningful relationship between TOEFL iBT scores and GPA. Limitations and implications are discussed. [Reprinted by permission of Sage Publications, Ltd., copyright holder.]
Keywords: applied linguistics, language testing and assessment, English as a Second Language Tests, College Students, Higher Education, English Proficiency, Academic Achievement
He, Ling, & Shi, Ling. (2012). Topical knowledge and ESL writing. Language Testing, 29(3), 443-464.
Abstract: This study investigates the effects of topical knowledge on ESL (English as a Second Language) writing performance in the English Language Proficiency Index (LPI), a standardized English proficiency test used by many post-secondary institutions in western Canada. The participants were 50 students with different levels of English proficiency (basic, intermediate, and advanced) attending a Canadian college. Each student wrote two timed-impromptu essays: one responding to a prompt requiring general knowledge about university studies and the other pertaining to specific knowledge about federal politics. Results showed that students across three proficiency levels performed significantly better on the general topic than they did on the specific topic. The specific topic produced lower scores on content due to poor quality and development of ideas, implicit position taking, and a weak conclusion. Students also scored lower on organization and language on the knowledge-specific task because of weaker coherence and cohesion, shorter essays, more language errors, and less frequent use of academic words. Post-test interviews confirmed that participating students were challenged by the prompt that required specific topical knowledge. The study draws attention to the importance of developing appropriate prompts for ESL writing tests. [Reprinted by permission of Sage Publications, Ltd., copyright holder.]
Keywords: applied linguistics, language testing and assessment, College Students, Canada, Higher Education, English as a Second Language Tests, Writing Tests, Second Language Writing, English Proficiency
Xi, X., Higgins, D., Zechner, K., & Williamson, D. (2012). A comparison of two scoring methods for an automated speech scoring system. Language Testing, 29(3), 371-394.
Abstract: This paper compares two alternative scoring methods -- multiple regression and classification trees -- for an automated speech scoring system used in a practice environment. The two methods were evaluated on two criteria: construct representation and empirical performance in predicting human scores. The empirical performance of the two scoring models is reported in Zechner, Higgins, Xi, & Williamson (2009), which discusses the development of the entire automated speech scoring system; the current paper shifts the focus to the comparison of the two scoring methods, elaborating both technical and substantive considerations and providing a reasoned argument for the trade-off between them. We concluded that a multiple regression model with expert weights was superior to the classification tree model. In addition to comparing the relative performance of the two models, we also evaluated the adequacy of the regression model for the intended use. In particular, the construct representation of the model was sufficiently broad to justify its use in a low-stakes application. The correlation of the model-predicted total test scores with human scores (r = 0.7) was also deemed acceptable for practice purposes. [Reprinted by permission of Sage Publications, Ltd., copyright holder.]
Keywords: applied linguistics, language testing and assessment, English as a Second Language Tests, Automatic Speaker Recognition
